Detecting influential observations in principal components and common principal components
Abstract
Detecting outlying observations is an important step in any analysis, even when robust estimates are used. In particular, the robustified Mahalanobis distance is a natural measure of outlyingness if one focuses on ellipsoidal distributions. However, it is well known that the asymptotic chi-square approximation for the cutoff value of the Mahalanobis distance based on several robust estimates (like the minimum volume ellipsoid, the minimum covariance determinant and the S-estimators) is not adequate for detecting atypical observations in small samples from the normal distribution. In the multi-population setting and under a common principal components model, aggregated measures based on standardized empirical influence functions are used to detect observations with a significant impact on the estimators. As in the one-population setting, the cutoff values obtained from the asymptotic distribution of those aggregated measures are not adequate for small samples. More appropriate cutoff values, adapted to the sample sizes, can be computed by using a cross-validation approach. Cutoff values obtained from a Monte Carlo study using S-estimators are provided for illustration. A real data set is also analyzed. © 2010 Elsevier B.V. All rights reserved.
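The small-sample problem described in the abstract can be illustrated with a minimal Monte Carlo sketch. Everything in it is an assumption made for illustration and is not the paper's procedure: it uses a single bivariate normal population, a crude MCD-style estimator (a few random starts with concentration steps) in place of the S-estimators, and pooled squared distances rather than the aggregated influence measures. The function names (`crude_mcd`, `monte_carlo_cutoff`, and so on) are hypothetical. Because Mahalanobis distances based on an affine-equivariant estimator do not depend on the true location and scatter, simulating from the standard normal is enough.

```python
import math
import random


def mean_cov(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of (x, y) points."""
    m = len(points)
    mx = sum(x for x, _ in points) / m
    my = sum(y for _, y in points) / m
    sxx = sum((x - mx) ** 2 for x, _ in points) / (m - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (m - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (m - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(points, center, cov):
    """Squared Mahalanobis distances, using the closed-form 2x2 inverse."""
    (mx, my), (sxx, sxy, syy) = center, cov
    det = sxx * syy - sxy * sxy
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my)
         + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def crude_mcd(points, rng, n_starts=5, n_csteps=3):
    """Illustrative MCD-style estimate (assumption: stand-in for S-estimators)."""
    n = len(points)
    h = (n + 3) // 2  # (n + p + 1) // 2 with p = 2
    best = None
    for _ in range(n_starts):
        center, cov = mean_cov(rng.sample(points, 3))
        for _ in range(n_csteps):
            # Concentration step: refit on the h closest points.
            d2 = sq_mahalanobis(points, center, cov)
            keep = sorted(range(n), key=d2.__getitem__)[:h]
            center, cov = mean_cov([points[i] for i in keep])
        det = cov[0] * cov[2] - cov[1] ** 2
        if best is None or det < best[0]:
            best = (det, center, cov)
    _, center, cov = best
    # Consistency factor at the normal model for p = 2:
    # alpha / P(chi2_4 <= q), with q the alpha-quantile of chi2_2.
    alpha = h / n
    q = -2.0 * math.log(1.0 - alpha)
    c = alpha / (1.0 - (1.0 - alpha) * (1.0 + q / 2.0))
    return center, tuple(c * s for s in cov)


def chi2_cutoff_2df(level):
    """Exact quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - level)


def monte_carlo_cutoff(n, level=0.975, reps=200, seed=0):
    """Empirical quantile of robust squared distances for clean samples of size n."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(reps):
        sample = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        center, cov = crude_mcd(sample, rng)
        pooled.extend(sq_mahalanobis(sample, center, cov))
    pooled.sort()
    return pooled[math.ceil(level * len(pooled)) - 1]


chi2_cut = chi2_cutoff_2df(0.975)
mc_cut = monte_carlo_cutoff(n=20)
print(f"chi-square cutoff: {chi2_cut:.2f}, Monte Carlo cutoff (n=20): {mc_cut:.2f}")
```

Comparing the two printed values for a small `n` shows how far the asymptotic cutoff can be from a simulated one under this stand-in estimator; the number of replications and starts is kept small here only to keep the sketch fast.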
Similar resources
A new weighting approach to Non-Parametric composite indices compared with principal components analysis
The introduction of the Human Development Index (HDI) by UNDP in early 1990 was followed by a surge in the use of non-parametric and parametric indices for measuring and comparing countries' performance in development, globalization, competition, well-being, etc. The HDI is a composite index of three indicators. Its components are to reflect three major dimensions of human development: longevity, knowledg...
On convergence of sample and population Hilbertian functional principal components
In this article we consider the sequences of sample and population covariance operators for a sequence of arrays of Hilbertian random elements. Then, under the assumptions that the norms of the covariance operators are uniformly bounded and the sequences of the principal component scores are uniformly summable, we prove that the convergence of the sequences of covariance operators would impl...
Persian Handwriting Analysis Using Functional Principal Components
Principal components analysis is a well-known statistical method for dealing with large dependent data sets. It is also used with functional data, both for data reduction and for representing variation. On the other hand, "handwriting" is one of the objects studied in various statistical fields such as pattern recognition and shape analysis. Considering time as the argument,...
Morphological Comparison of two populations of lake goby Rhinogobius similis Gill, 1859 from Hariroud basin
Knowledge of fish species is important in habitat protection management. This study was conducted to compare the morphological characteristics of two populations of Rhinogobius similis from the Hariroud basin, based on a landmark morphometric truss network system. A total of 60 individuals from the Polkhatoun (30 specimens) and Tafrihgah dam (30 specimens) stations were caught by electrofishing 220 vo...
Evaluation and Geographical analysis of the principal components affecting urban economic sustainability, Case study: Cities of Chaharmahal and Bakhtiari Province
Abstract Aims & Backgrounds: Today, economic challenges are one of the most important obstacles to achieving sustainability in the cities of developing countries. Therefore, recognition and geographical analysis of the factors affecting the economic sustainability of cities are among the important goals and priorities of urban and regional planning. Methodology: This research has been done by q...
Journal title:
- Computational Statistics & Data Analysis
Volume 54
Publication date: 2010